The Essence language allows users to specify constraint problems above the level of abstraction at which constraint modelling decisions are made. An Essence specification is refined into a constraint model by the Conjure automated modelling tool, which employs a suite of refinement rules. However, Essence is a rich language in which there are many equivalent ways to specify a given problem. A user may therefore omit the use of domain attributes or abstract types, resulting in fewer applicable refinement rules and hence a reduced set of output models from which to choose. This paper addresses the problem of automatically recovering this information, so that the quality of the output constraint models is robust to variations in the input Essence specification. We present reformulation rules that can change the type of a decision variable or add attributes that narrow its domain. We demonstrate the efficacy of this approach in terms of the quantity and quality of models that can be produced from the transformed specifications compared with the originals.
translated by 谷歌翻译
The challenges of collecting medical data on neurological disorder diagnosis problems paved the way for learning methods with a scarce number of samples. For this reason, one-shot learning remains one of the most challenging and trending concepts in deep learning, as it proposes to simulate a human-like learning approach in classification problems. Previous studies have focused on generating more accurate fingerprints of a population using graph neural networks (GNNs) with connectomic brain graph data. The generated population fingerprints, named connectional brain templates (CBTs), enabled the detection of discriminative biomarkers of the population in classification tasks. However, the reverse problem of data augmentation from a single graph representing brain connectivity has never been tackled before. In this paper, we propose an augmentation pipeline that provides improved metrics on our binary classification problem. Departing from previous studies, we examine augmentation from a single population template by utilizing a graph-based generative adversarial network (gGAN) architecture for a classification problem. We benchmarked our proposed solution on the AD/LMCI dataset, consisting of brain connectomes with Alzheimer's Disease (AD) and Late Mild Cognitive Impairment (LMCI). To evaluate our model's generalizability, we used a cross-validation strategy and randomly sampled the folds multiple times. Our results on classification not only provide better accuracy when augmented data generated from one sample is introduced, but also yield more balanced results on other metrics.
Federated Learning (FL) has become a key choice for distributed machine learning. Initially focused on centralized aggregation, recent works in FL have emphasized greater decentralization to adapt to the highly heterogeneous network edge. Among these, Hierarchical, Device-to-Device, and Gossip Federated Learning (HFL, D2DFL, and GFL respectively) can be considered foundational FL algorithms employing fundamental aggregation strategies. A number of FL algorithms were subsequently proposed that employ multiple fundamental aggregation schemes jointly. Existing research, however, subjects the FL algorithms to varied conditions and gauges their performance mainly against Federated Averaging (FedAvg) only. This work consolidates the FL landscape and offers an objective analysis of the major FL algorithms through a comprehensive cross-evaluation over a wide range of operating conditions. In addition to the three foundational FL algorithms, this work also analyzes six derived algorithms. To enable a uniform assessment, a multi-FL framework named FLAGS (Federated Learning AlGorithms Simulation) has been developed for rapid configuration of multiple FL algorithms. Our experiments indicate that fully decentralized FL algorithms achieve comparable accuracy under multiple operating conditions, including asynchronous aggregation and the presence of stragglers. Furthermore, decentralized FL can also operate in noisy environments and with a comparably higher local update rate. However, the impact of extremely skewed data distributions on decentralized FL is much more adverse than on centralized variants. The results indicate that it may not be necessary to restrict the devices to a single FL algorithm; rather, multi-FL nodes may operate with greater efficiency.
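The FedAvg baseline against which these algorithms are typically gauged can be sketched in a few lines of plain Python; the model-as-list representation and sample-count weighting below are illustrative simplifications, not part of the FLAGS framework.

```python
# Minimal FedAvg aggregation sketch: each client submits its locally
# updated parameters plus the number of samples it trained on, and the
# server forms the sample-weighted average. Models are plain float lists.

def fedavg(client_updates):
    """client_updates: list of (params, n_samples) tuples."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(params[i] * n for params, n in client_updates) / total
        for i in range(dim)
    ]

# A client with more data pulls the average toward its own parameters.
global_model = fedavg([([1.0, 0.0], 30), ([0.0, 1.0], 10)])
print(global_model)  # [0.75, 0.25]
```

Hierarchical, D2D, and gossip variants differ mainly in *where* and *with whom* this averaging step runs, not in the averaging rule itself.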
The need for data privacy and security -- enforced through increasingly strict data protection regulations -- renders the use of healthcare data for machine learning difficult. In particular, the transfer of data between different hospitals is often not permissible, and cross-site pooling of data is thus not an option. The Personal Health Train (PHT) paradigm proposed within the GO-FAIR initiative implements an 'algorithm to the data' paradigm that ensures that distributed data can be accessed for analysis without transferring any sensitive data. We present PHT-meDIC, a productively deployed open-source implementation of the PHT concept. Containerization allows us to easily deploy even complex data analysis pipelines (e.g., genomics, image analysis) across multiple sites in a secure and scalable manner. We discuss the underlying technological concepts, security models, and governance processes. The implementation has been successfully applied to distributed analyses of large-scale data, including applications of deep neural networks to medical image data.
Türkiye is located on a fault line; earthquakes often occur there on both large and small scales. There is a need for effective solutions for gathering current information during disasters. We can use social media to gain insight into public opinion, and this insight can be used in public relations and disaster management. In this study, Twitter posts on the Izmir earthquake that took place in October 2020 are analyzed. We ask whether this analysis can be used to make timely social inferences. Data mining and natural language processing (NLP) methods are used for this analysis: NLP is applied for sentiment analysis and topic modelling. The Latent Dirichlet Allocation (LDA) algorithm is used for topic modelling, and a Bidirectional Encoder Representations from Transformers (BERT) model built on the Transformer architecture is used for sentiment analysis. We show that users shared their goodwill wishes and aimed to contribute to the aid activities initiated after the earthquake, and that they desired to make their voices heard by competent institutions and organizations. The proposed methods work effectively. Future studies are also discussed.
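As a rough illustration of the pipeline's shape (normalize tweets, then classify sentiment), here is a minimal Python sketch; the lexicon scorer is only a hypothetical stand-in for the BERT classifier used in the study, and the word lists are invented.

```python
# Toy sketch of the tweet-analysis pipeline shape: normalize text, then
# score sentiment. The study uses a BERT classifier (and LDA for topic
# modelling); the lexicon scorer below only shows where such models plug in.
import re

POSITIVE = {"hope", "help", "support", "donate"}   # invented word lists
NEGATIVE = {"damage", "fear", "loss"}

def normalize(tweet):
    tweet = tweet.lower()
    # Strip URLs, mentions, and hashtags before tokenizing.
    tweet = re.sub(r"https?://\S+|[@#]\w+", " ", tweet)
    return re.findall(r"[a-z']+", tweet)

def sentiment(tweet):
    tokens = normalize(tweet)
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("Sending hope and support to Izmir! #earthquake"))  # positive
```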
We develop a resilient binary hypothesis testing framework for decision making in adversarial multi-robot crowdsensing tasks. This framework exploits stochastic trust observations between robots to arrive at tractable, resilient decisions at a centralized Fusion Center (FC), even when i) there exist malicious robots in the network whose number may be larger than the number of legitimate robots, and ii) the FC uses one-shot noisy measurements from all robots. We derive two algorithms to achieve this. The first is the Two Stage Approach (2SA), which estimates the legitimacy of robots based on the received trust observations and provably minimizes the probability of detection error under the worst-case malicious attack; here, the fraction of malicious robots is known but arbitrary. For an unknown fraction of malicious robots, we develop the Adversarial Generalized Likelihood Ratio Test (A-GLRT), which uses both the reported robot measurements and the trust observations to simultaneously estimate the trustworthiness of the robots, their reporting strategies, and the correct hypothesis. We exploit special problem structure to show that this approach remains computationally tractable despite several unknown problem parameters. We deploy both algorithms in a hardware experiment in which a group of robots crowdsenses traffic conditions on a mock-up road network while subject to a Sybil attack. We extract the trust observations for each robot from actual communication signals, which provide statistical information on the uniqueness of the sender. We show that the FC can reduce the probability of detection error to 30.5% and 29% for the 2SA and the A-GLRT respectively, even when the malicious robots are in the majority.
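A minimal sketch of a two-stage, trust-weighted fusion in the spirit of the 2SA can illustrate the idea; the thresholding rule, trust values, and robot names below are illustrative assumptions, not the paper's exact estimator.

```python
# Illustrative two-stage fusion sketch: stage 1 classifies robots as
# legitimate from the mean of their stochastic trust observations;
# stage 2 takes a majority vote over the one-shot reports of the robots
# deemed legitimate. Threshold and data are hypothetical.

def two_stage_decision(reports, trust_obs, tau=0.5):
    """reports: {robot: 0 or 1}; trust_obs: {robot: [trust values in [0,1]]}."""
    # Stage 1: keep robots whose average trust observation exceeds tau.
    legit = [r for r, obs in trust_obs.items()
             if sum(obs) / len(obs) > tau]
    # Stage 2: majority vote among the legitimate robots' reports.
    votes = sum(reports[r] for r in legit)
    return 1 if 2 * votes > len(legit) else 0

reports = {"a": 1, "b": 1, "c": 0, "d": 0, "e": 0}
trust = {"a": [0.9, 0.8], "b": [0.7, 0.9],
         "c": [0.2, 0.3], "d": [0.1, 0.4], "e": [0.3, 0.2]}
# Malicious robots c, d, e are in the majority but draw low trust
# observations, so the fusion center still decides hypothesis 1.
print(two_stage_decision(reports, trust))  # 1
```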
The rise of simulation environments has enabled learning-based approaches to assembly planning, otherwise a labor-intensive and daunting task. Assembling furniture is especially interesting because furniture is intricate and poses challenges for learning-based approaches. Surprisingly, humans can solve furniture assembly given only a 2D snapshot of the assembled product. Although recent years have witnessed promising learning-based approaches to furniture assembly, they assume the availability of correct connection labels for each assembly step, which are expensive to obtain in practice. In this paper, we relax this assumption and aim to solve furniture assembly with as little human expertise and supervision as possible. Specifically, we assume the availability of the assembled product's point cloud, and by comparing the point cloud of the current assembly with the point cloud of the target product, we obtain a novel reward signal based on two measures: incorrectness and incompleteness. We show that this novel reward signal can train a deep network to successfully assemble different types of furniture. Code and networks are available at: https://github.com/metu-kalfa/assemblerl
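The reward construction can be sketched with plain point sets; the coverage-based measures, distance threshold, and the way the two terms are combined below are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch of a point-cloud comparison reward in the spirit of the
# two measures above: "incorrectness" (current points far from every
# target point) and "incompleteness" (target points not yet covered by
# the current assembly). Threshold and combination are illustrative.
import math

def _covered(p, cloud, eps):
    return any(math.dist(p, q) <= eps for q in cloud)

def assembly_reward(current, target, eps=0.1):
    incorrect = sum(not _covered(p, target, eps) for p in current) / len(current)
    incomplete = sum(not _covered(q, current, eps) for q in target) / len(target)
    # 0 when the clouds match; more negative the worse the assembly.
    return -(incorrect + incomplete)

target = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(assembly_reward(target, target))        # 0.0 -- perfect assembly
print(assembly_reward([(0.0, 0.0)], target))  # negative: correct but incomplete
```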
Motivation: The development of novel compounds targeting proteins of interest is one of the most important tasks in the pharmaceutical industry. Deep generative models have been applied to targeted molecular design and have shown promising results. Recently, target-specific molecule generation has been viewed as a translation between the protein language and the chemical language. However, such models are limited by the availability of interacting protein-ligand pairs. On the other hand, large amounts of unlabeled protein sequences and chemical compounds are available and have been used to train language models that learn useful representations. In this study, we propose exploiting pretrained biochemical language models to initialize (i.e., warm start) targeted molecule generation models. We investigate two warm-start strategies: (i) a one-stage strategy, where the initialized model is trained on targeted molecule generation, and (ii) a two-stage strategy, which contains pretraining on molecule generation followed by target-specific training. We also compare two decoding strategies for generating compounds: beam search and sampling. Results: The results show that the warm-started models perform better than a baseline model trained from scratch. The two proposed warm-start strategies achieve similar results to each other with respect to widely used benchmark metrics. However, docking evaluation of compounds generated for a number of novel proteins suggests that the one-stage strategy generalizes better than the two-stage strategy. Additionally, we observe that beam search outperforms sampling in both the docking evaluation and the benchmark metrics for assessing compound quality. Availability and implementation: The source code is available at https://github.com/boun-tabi/biochemical-lms-for-drug-design and the materials are archived on Zenodo at https://doi.org/10.5281/zenodo.6832145
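Beam search, the stronger of the two decoding strategies in the comparison above, can be sketched over a toy next-token model; the `step` function and its tiny vocabulary are invented stand-ins for the trained decoder, which would condition on the target protein.

```python
# Minimal beam-search decoding sketch: keep the `beam_width` highest
# cumulative log-probability sequences at each step, letting finished
# sequences (ending in <eos>) pass through unchanged.
import math

def beam_search(step, beam_width=2, max_len=3, eos="<eos>"):
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                candidates.append((seq, score))  # finished beam passes through
                continue
            for tok, logp in step(seq):
                candidates.append((seq + [tok], score + logp))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]

# Toy stand-in for a trained decoder over a tiny "vocabulary".
def toy_step(prefix):
    if not prefix:
        return [("C", math.log(0.6)), ("N", math.log(0.4))]
    return [("C", math.log(0.5)), ("<eos>", math.log(0.5))]

print(beam_search(toy_step))  # ['C', '<eos>']
```

Sampling, by contrast, would draw each next token from the step distribution instead of keeping the top-scoring prefixes.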
In this study, we aim to offer linguistically motivated solutions to resolve the issues of the lack of representation of null morphemes, highly productive derivational processes, and syncretic morphemes in Turkish in the BOUN Treebank, without diverging from the Universal Dependencies framework. To address these issues, new annotation conventions were introduced by splitting certain lemmas and utilizing the MISC (miscellaneous) field of the UD framework to denote derivation. The representational capabilities of the re-annotated treebank were tested on an LSTM-based dependency parser, and an updated version of the BoAT annotation tool is introduced.
Deep learning has generated immense interest in medical imaging, particularly in the use of convolutional neural networks (CNNs) to develop automated diagnostic tools. The ease of its non-invasive acquisition makes retinal fundus imaging well suited to such automated approaches. Recent work on analyzing fundus images with CNNs relies on access to massive amounts of data for training and validation - hundreds of thousands of images. However, data residency and data privacy restrictions hinder the applicability of this approach in medical settings where patient confidentiality is mandated. Here, we present results on the performance of deep learning on small datasets for classifying patient sex from fundus images - a trait that until recently was thought not to be present or quantifiable in fundus images. We fine-tuned a ResNet-152 model whose final layer was modified for binary classification. In several experiments, we assess performance in the small-dataset context using one private (DOV) and one public (ODIR) data source. Our models, developed using approximately 2500 fundus images, achieved an AUC score of up to 0.72 (95% CI: [0.67, 0.77]). This corresponds to a performance decrease of only 25% despite a nearly 1000-fold reduction in dataset size compared with prior work in the literature. Even for a difficult task such as sex classification from retinal images, we find that classification is possible with very small datasets. Additionally, we performed domain adaptation experiments between DOV and ODIR, explored the impact of data curation on training and generalizability, and investigated model ensembling to maximize the performance of CNN classifiers developed on small datasets.
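The AUC metric reported above can be computed without any ML library via its Mann-Whitney formulation; the labels and scores below are illustrative, not from the paper's experiments.

```python
# Minimal AUC (area under the ROC curve) via the Mann-Whitney statistic:
# the probability that a randomly chosen positive example is scored
# higher than a randomly chosen negative one (ties count as 0.5).

def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```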